[model] support for microsoft's Phi-4-mini #9734
Conversation
Summary of Changes (Gemini Code Assist)
This pull request integrates Microsoft's Phi-4-mini-instruct model, a compact language model known for its strong reasoning capabilities and efficiency. The changes define a dedicated chat template to ensure proper interaction formatting and register the model with its official sources, making it readily available to users seeking high-quality, resource-efficient models.
Code Review
The pull request successfully adds support for Microsoft's Phi-4-mini model, including updates to the documentation, registration of the new model's template, and integration into the model group constants. The changes are well-implemented and follow the existing patterns in the codebase.
hiyouga left a comment
LGTM
🎯 Brief Introduction
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4 model family and supports a 128K token context length. It underwent an enhancement process incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
Primary Use Cases
The model is intended for broad multilingual commercial and research use, providing a foundation for general-purpose AI systems and applications. It is designed to accelerate research on language and multimodal models and to serve as a building block for generative AI-powered features.
Model
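Adding the model involves registering its official sources alongside the new chat template. As a rough illustration only (the hub IDs below are assumed from Microsoft's public releases and are not copied from the diff), the model-group entry in src/llamafactory/extras/constants.py typically looks like this:

```python
# Illustrative sketch of the model-group entry in src/llamafactory/extras/constants.py.
# Hub IDs are assumptions based on Microsoft's public releases; the merged PR may differ.
register_model_group(
    models={
        "Phi-4-mini-instruct": {
            DownloadSource.DEFAULT: "microsoft/Phi-4-mini-instruct",
            DownloadSource.MODELSCOPE: "LLM-Research/Phi-4-mini-instruct",
        },
    },
    template="phi4_mini",
)
```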
Usage for Training
Create a new file examples/train_lora/phi4_mini_lora_sft.yaml with the following content:
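A minimal working configuration, sketched from LLaMA-Factory's standard LoRA SFT example configs (dataset names and hyperparameters below are illustrative assumptions; only template: phi4_mini is specific to this PR), might look like:

```yaml
### model
model_name_or_path: microsoft/Phi-4-mini-instruct
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all

### dataset
dataset: identity,alpaca_en_demo   # placeholder datasets for illustration
template: phi4_mini
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/phi4_mini/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
```

Training can then be launched with llamafactory-cli train examples/train_lora/phi4_mini_lora_sft.yaml.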
| template used | result |
| --- | --- |
| template: phi | training loss curve (screenshot) |
| template: phi4_mini | training loss curve (screenshot) |
We compared the training loss between the phi template and the phi4_mini template. The results indicate that the phi4_mini template is better suited for training the current Phi-4-mini model, demonstrating better initial loss convergence and format compatibility with the target model.
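For reference, the new template registration in src/llamafactory/data/template.py would look roughly like the sketch below. The special tokens follow the chat format published in the Phi-4-mini-instruct model card (<|system|>, <|user|>, <|assistant|>, <|end|>); the exact slot strings in the merged PR may differ.

```python
# Hedged sketch: approximate registration of the phi4_mini chat template,
# using the module-local register_template and StringFormatter helpers.
# Token strings follow the Phi-4-mini-instruct model card; the actual PR may differ.
register_template(
    name="phi4_mini",
    format_user=StringFormatter(slots=["<|user|>{{content}}<|end|><|assistant|>"]),
    format_system=StringFormatter(slots=["<|system|>{{content}}<|end|>"]),
    stop_words=["<|end|>"],
)
```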